Performance and power comparison of Thread Level Speculation in SMT and CMP architectures

Authors

  • Venkatesan Packirisamy
  • Antonia Zhai
  • Wei-chung Hsu
  • Pen-Chung Yew
Abstract

As technology advances, microprocessors that support multiple threads of execution on a single chip are becoming increasingly common. Improving the performance of general-purpose applications by extracting parallel threads is extremely difficult, due to the complex control flow and ambiguous data dependences that are inherent to these applications. Thread-Level Speculation (TLS) enables speculative parallel execution of potentially dependent threads, and ensures correct execution by providing hardware support to detect data dependence violations and to recover from speculation failures. TLS can be supported on a variety of architectures, among them Chip MultiProcessors (CMP) and Simultaneous MultiThreading (SMT). While there have been numerous papers comparing the performance and power efficiency of SMT and CMP processors under various workloads, relatively little has been done to compare them in the context of TLS. While CMPs utilize smaller and more power-efficient cores, resource sharing and constructive interference between speculative and non-speculative threads can potentially make SMT more power efficient. Thus, this paper aims to fill this void by extending a CMP and an SMT processor to support TLS, and evaluating the performance and power efficiency of the resulting systems with speculative parallel threads extracted for the SPEC2000 benchmark suite. Both SMT and CMP processors have a large variety of configurations; we choose to conduct our study on two architectures with equal die area and the same clock frequency. Our results show that an SMT processor that supports four speculative threads outperforms a CMP processor that supports the same

Similar resources

Speculative Precomputation on Chip Multiprocessors

Previous work on speculative precomputation (SP) on simultaneous multithreaded (SMT) architectures has shown significant benefits. The SP techniques improve single-threaded program performance by utilizing otherwise idle thread contexts to run “helper threads”, which prefetch critical data into shared caches and reduce the time the “main thread” stalls waiting for long latency outstanding loads....

Comparing the Energy Efficiency of CMP and SMT Architectures for Multimedia Workloads

Chip multiprocessing (CMP) and simultaneous multithreading (SMT) are two recently adopted techniques for improving the throughput of general-purpose processors by using multithreading. These techniques are likely to benefit the increasingly important real-time multimedia workloads, which are inherently multithreaded. These workloads, however, often run in an energy constrained environment. This...

Thread-Level Speculation on a CMP Can Be Energy Efficient

While Chip Multiprocessors (CMP) with Thread-Level Speculation (TLS) have become the subject of intense research, processor designers in industry have reservations about their practical implementation. An often-cited complaint is that TLS is too energy-inefficient to compete against conventional superscalars. This paper challenges the commonly held view that TLS is energy inefficient. We identi...

Dynamic Helper Threaded Prefetching on the Sun UltraSPARC CMP Processor

Data prefetching via helper threading has been extensively investigated on Simultaneous MultiThreading (SMT) or Virtual Multi-Threading (VMT) architectures. Although reportedly large cache latency can be hidden by helper threads at runtime, most techniques rely on hardware support to reduce context switch overhead between the main thread and helper thread as well as rely on static profile feedb...

Temperature-Aware Design Issues for SMT and CMP Architectures

With increasing power density in modern processors, management of on-chip temperature is fast becoming a bottleneck for chip designers. To address this, beyond conventional power and energy analysis it is necessary to apply temperature-aware analysis. In this paper we present thermal-aware experiments on simultaneous multithreaded (SMT) and chip multiprocessor (CMP) architectures. Both SMT and ...

Publication date: 2007